We study the temporal evolution of the structure of the world's largest subway networks in an
exploratory manner. We show that, remarkably, all these networks converge to a shape which shares 
similar generic features despite their geographic and economic differences. This limiting shape is 
made of a core with branches radiating from it. For most of these networks, the average degree 
of a node (station) within the core has a value of order 2.5 and the proportion of $k = 2$ nodes
in the core is larger than 60%. The number of branches scales roughly as the square root of the 
number of stations, the current proportion of branches represents about half of the total number 
of stations, and the average diameter of branches is about twice the average radial extension of the 
core. Spatial measures such as the number of stations at a given distance to the barycenter display
a first regime which grows as $r^2$, followed by another regime with different exponents, and eventually
saturate. These results, difficult to interpret in the framework of fractal geometry, find
a natural explanation in the geometric picture of this core and its branches: the first regime
corresponds to a uniform core, while the second regime is controlled by the interstation spacing on 
branches. The apparent convergence towards a unique network shape in the temporal limit suggests 
the existence of dominant, universal mechanisms governing the evolution of these structures. 



INTRODUCTION 

Transportation systems, especially mass transit, are an important component of cities and their
expansion. In a world where more than 50% of the population lives in urban areas [1], and where
individual transportation increases in cost as cities grow larger, mass transit, and in particular
subway networks, are central to the evolution of cities, their spatial organization [2]-[4] and the
dynamical processes occurring in them [5, 6]. The percentage $s(P)$ of cities with a subway system
versus their population size $P$ is shown in Fig. 1 (the data were obtained for cities with
population larger than 10^ [7]), which confirms that the larger a city, the more likely it is to
have some form of mass transit system (see also [8]). Approximately 25% of the cities of more than
one million individuals have a subway system, 50% of those of more than two million, and all those
above 10 million have a subway system (as an indication, an exponential fit of the plot in Fig. 1
gives $s(P) = 1 - \exp(-P/P_0)$ where the typical population $P_0$ is of order 3 million).
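As a quick consistency check of the exponential fit quoted above, the following sketch evaluates $s(P) = 1 - \exp(-P/P_0)$ at the three population sizes mentioned in the text. The value of `P0` is the order of magnitude given in the text, taken here as exactly 3 million for illustration, and the function name `subway_share` is ours.

```python
import math

P0 = 3e6  # typical population, "of order 3 million" in the text (assumed exact here)

def subway_share(population, p0=P0):
    """Fraction of cities of a given population expected to have a subway."""
    return 1.0 - math.exp(-population / p0)

for population in (1e6, 2e6, 10e6):
    print(f"P = {population:.0e}: s(P) = {subway_share(population):.2f}")
```

Under this assumption the fit gives roughly 0.28, 0.49 and 0.96, close to the 25%, 50% and 100% quoted in the text.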

For some cities, subway systems have existed for more than a century. Fascination with the apparent
diversity of their structure has led to many studies and to particular abstractions of their
representation in the design of idealized transit maps [9]. Although these systems might appear to
be planned in some centralized manner, it is our contention here that subway systems, like many
other features of city systems, evolve and self-organize as the product of a stream of rational but
usually uncoordinated decisions taking place through time.

FIG. 1: Percentage of cities with a subway system versus the population (data from the UN [7]).

Generally speaking, subway systems have been developed to improve movement in urban areas and to
reduce congestion. The early history of subways is sometimes connected to large-scale planning, for
instance with the need to bring population from a growing periphery to the center where production
and exchange traditionally take place. More broadly, it might seem that subway systems are
engineered systems, intentionally structured in a core/periphery shape, with self-organization thus
playing only a very minor role. This would be true if these subway systems were planned from their
beginning to their current shape, but this is not the case for most networks. Their shape results
from multiple actions, from planning within a time-limited horizon, set within the wider context of
the evolution of the spatial distribution of population and related economic activities. We thus
conjecture that subway networks actually result from a superimposition of many actions, both at a
central level with planning and at a smaller scale with the reorganization and regeneration of
economic activity and the growth of residential populations. In this perspective, subway systems are
self-organizing systems, driven by the same mechanisms and responding to various geographical
constraints and historical paths. This self-organized view leads to the idea that, beside local
peculiarities due to the history and topography of the particular system, the topology of world
subway networks displays general, universal features, within the limits of the physical geometry
and cultural context in which their growth takes place.

The detection and characterization of these features 
require us to understand the evolution of these spatial 
structures. Indeed, subway networks are spatial [10, 11]
in the sense that they form a graph where stations are the 
nodes and links represent rail connections. We now un- 
derstand quite well how to characterize a spatial network 
but we still lack tools for studying their temporal evolu- 
tion. The present article tackles this problem, proposing 
various measures for these time dependent, spatial net- 
works. 

Here we focus on the largest networks in major world 
cities and thus ignore currently developing, smaller net- 
works in many medium-sized cities. We thus consider 
most of the largest metro networks (with at least one 
hundred stations) which exist in major world cities. 
These are: Barcelona, Beijing, Berlin, Chicago, London, 
Madrid, Mexico, Moscow, New York City (NYC), Osaka, 
Paris, Seoul, Shanghai, and Tokyo, for which we show a 
sample in Fig. 2. Additionally, we focus on urban subway
systems and do not consider longer-distance heavy
and light-rail commuting systems in urban areas, such as
the RER (Réseau Express Régional) in Paris or the overground
Network Rail in London.

Static properties of transportation networks have been studied for many years [12]; in particular,
simple connectivity properties were studied in [13] while fractal aspects were considered in [14].
With the recent availability of new data, studies of transportation systems have accelerated [11]
and this is particularly so for subway systems [15]-[23]. These studies have revealed some
significant similarities between different networks, despite differences in their historical
development and in the cultures and economies in which they have been developed. In particular,
their average shortest path seems to scale with the square root of the number of stations and the
average clustering coefficient is large, consistent with general results associated with
two-dimensional spatial networks (see [11]). In [1^, a strong correlation between the number of
stations (for bus and tramway systems) and population size has been observed for 22 Polish cities,
but such correlations are not observed at the world level (for all public transportation modes [8]).

FIG. 2: A sample of large subway networks in large urban areas, all displaying a core and branches
structure. From left to right and top to bottom: Shanghai, Madrid, Moscow, Tokyo, Seoul, Barcelona
(figures from Wikimedia Commons).

Our empirical analysis of the evolution of these trans- 
portation networks is in line with approaches developed 
in the 1970's (see [24] and references therein) but we take 
advantage here of recent progress made in the under- 
standing of spatial networks in general and new historical 
data sources which provide us with detailed chronologies 
of how these networks have developed. 



Data 

The network topologies at various points in time were built using two main data sources. First,
current network maps as of 2009 were used to define lines for each network, and then to define
line-based topologies, i.e. which station(s) follow(s) which other station(s) on each line. This
information was then combined with opening dates for lines and stations. This second type of data
was gathered from Wikipedia [27]: for most networks, there is one page per station with various
information, including the first date of operation, the precise location and address, number of
passengers, etc. The network building process for a given year t is then as follows. The list of
open lines at year t is first established. For each open line, open stations at year t are listed
and connections are created between contiguous stations according to the network topology. A station
which is not open at year t on the given line, even if it is already open on a different line, is
discarded for the construction of the line. Finally, those independent line topologies are
gathered into the subway graph corresponding to year t.
Note that we used 2009 topologies as it was relatively 
difficult to find and process network maps for all these 
networks for each year of their existence. As a result, 
topologies for any given year before 2009 may overlook 
topology features pertaining to station or line closures: 
for instance, a station which existed between 1900 and 
1940 and which remained closed until now will not appear 
in any of our network datasets (such is the case for the 
British Museum Tube station). We suggest however that 
the effect of this bias is limited: on one hand, generally 
few stations undergo closure in the course of the network 
evolution; on the other hand, these stations are rarely 
hubs, most often intermediary stations (of degree two, 
i.e. connected to two stations), thus their non-inclusion 
bears little topological impact. 
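The building process described above can be sketched as follows. This is a minimal illustration and not the authors' code: the input layout (`lines` as ordered station lists taken from the 2009 map, `opening_year` keyed by line and station) and the function name `build_network` are assumptions. We further assume that two open stations are linked when no other open station of the same line lies between them.

```python
def build_network(lines, opening_year, year):
    """Return the set of undirected edges of the subway graph at a given year."""
    edges = set()
    for line, stations in lines.items():
        # Keep only the stations already open on this line at the given year.
        open_stations = [s for s in stations if opening_year[(line, s)] <= year]
        # Link contiguous open stations along the line.
        for a, b in zip(open_stations, open_stations[1:]):
            edges.add(frozenset((a, b)))
    return edges

# Hypothetical toy data: two lines sharing the station "a2".
lines = {"A": ["a1", "a2", "a3"], "B": ["b1", "a2", "b2"]}
opening_year = {
    ("A", "a1"): 1900, ("A", "a2"): 1900, ("A", "a3"): 1910,
    ("B", "b1"): 1920, ("B", "a2"): 1925, ("B", "b2"): 1920,
}
network_1922 = build_network(lines, opening_year, 1922)
```

In this toy case, station "a2" is open on line A in 1922 but not yet on line B, so line B contributes a direct b1-b2 link for that year.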

EXPLORING STATIC PROPERTIES 

The main characteristics of the networks we have chosen are shown in Table I, where we first observe
that the number of different lines appears to increase incrementally with the number of stations and
that on average for these world networks, there are approximately 18 stations per line. Also, the
mean interstation distance is on average $\ell_1 \approx 1$ km, with Beijing and Moscow showing the
longest ones (1.79 km and 1.67 km, respectively) and Paris displaying the shortest one (570 meters),
a diversity which finds its origin in the different historical paths of these networks. Other
quantities such as the catchment area (the average number of individuals served by one station)
could be computed but should be used with care: residential and economic activity densities vary
strongly across space and back-of-the-envelope arguments should only serve as a guide. Generally
speaking, many parameters such as the population density, land use activity distribution, and
traffic are important drivers in the evolution of those networks, but we will focus in this first
study on the characterization of these networks in terms of space and topology, independently of
other socio-economic considerations. A later extension of this research could examine these
physical and topological properties with respect to various definitions of density which might
include different activity types and various combinations related to the traffic that they generate.

In order to get some initial insight into the topology of these networks, one can first compare the
total length $\ell_T$ of these networks to the corresponding quantity computed for an almost regular
graph, $\ell_T^{reg}$, with the same number of stations, area, and average degree (the "degree" of a
node is the number of its neighbors in a graph). For a random planar graph with small degree
fluctuations ($k \approx \langle k \rangle$) and small fluctuations of the spatial distribution of
nodes, we can consider that the internode spacing is roughly constant and given by
$\ell_0 \sim 1/\sqrt{\rho}$, where $\rho = N/A$ is the density of nodes defined as the number of
nodes over the total area comprising all the nodes. The total length is then the number of edges
$E = N\langle k \rangle/2$ times $\ell_0$, which leads to [11]

$$\ell_T^{reg} \sim \frac{\langle k \rangle}{2} \sqrt{A N} \qquad (1)$$

In real applications, the determination of the quantity $A$ is a difficult problem, but here we
choose to use the metropolitan area as given by the various data sources. As shown in Table I, the
ratio $\ell_T/\ell_T^{reg}$ varies from 0.08 to 0.88, has an average of order 0.29 and displays
essentially three outliers. First, Osaka (and also Madrid and Seoul) has a very large value
indicating a highly reticulated structure. In contrast, Chicago and NYC have a much smaller value
($\approx 0.1$), signaling a more heterogeneous structure which in both these cases is probably
due to their strong geographical constraints.
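Eq. (1) and the ratio discussed above can be evaluated in a few lines. The sketch below is illustrative only: the function names and the numbers in the example are hypothetical and are not taken from Table I.

```python
import math

def regular_length(n_stations, area_km2, mean_degree):
    """Total length (km) of an almost regular planar graph, Eq. (1)."""
    return 0.5 * mean_degree * math.sqrt(area_km2 * n_stations)

def length_ratio(total_length_km, n_stations, area_km2, mean_degree):
    """Ratio of the observed total route length to the regular-graph estimate."""
    return total_length_km / regular_length(n_stations, area_km2, mean_degree)

# Hypothetical network: 100 stations over 400 km^2, average degree 2, 60 km of track.
ratio = length_ratio(60.0, n_stations=100, area_km2=400.0, mean_degree=2.0)
```

For these made-up inputs the regular-graph length is 200 km and the ratio is 0.3.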

The total length and the comparison with a regular 
structure give a first hint about the structure of these
networks but other indicators are needed to get a more 
focused view. There exist many different indicators and 
variables that describe these networks and their evolu- 
tion. An important difficulty thus lies in the choice of 
the many possible indicators and how to extract use- 
ful information from them. In addition, the largest net- 
works have a relatively small number of stations (always 
smaller than 500) which implies that we cannot expect 
to extract useful information from the probability distri- 
butions of various quantities as the results are too noisy. 
We thus have to compute more globally structured indi- 
cators which are, however, sensitive to the usually small 
temporal variations associated with these networks. In 
the following, we will focus on a certain number of these 
indicators, which we consider to be the most informative 
at this point. 

Finally, we will focus in this study on purely spatial and topological properties: we will consider
the evolution in space of these subway networks and we will not consider any other parameters which
might be used to characterize urban growth. Our study is exploratory and thus a first step towards
the integration of the most important factors into this research. Despite its simplicity, in that
we focus almost entirely on geometrical attributes, we consider that the evolution of the topology
encodes many different factors and that its study can point to some important general mechanisms
governing the evolution of these networks.

TABLE I. List of various indicators (for the year 2009) for the major subway networks considered in
this study (sorted according to their metro population). $P$ is the metropolitan area population
(for 2009), $N_L$ the number of lines, $N$ the number of physical stations, $\ell_1$ the average
inter-station distance, $\ell_T$ the total route length, $\ell_T^{reg}$ the total route length for a
regular graph with the same average degree, area, and number of stations, and $\beta$ the final
ratio between branch and core stations.



NETWORK DYNAMICS 

In order to get an initial impression of the dynamics of these networks, we first estimate the
simplest indicator $v = dN/dt$, which represents the number of new stations built per year. From
the instantaneous velocity, we can compute the average velocity over all years. This average can
however be misleading as there are many years where no stations are built, and we therefore also
report the fraction $f$ of 'inactivity' time. We provide results for the networks considered in
Table II, from which some interesting facts are revealed. Note that Shanghai and Seoul are the most
recent subway networks, experiencing a rapid expansion that has elevated them to amongst the
largest networks in the world.

For most of these networks the average velocity is in a small range (typically $v \in [1.4, 3.7]$)
except for Seoul and Shanghai, which are more recently developed networks. This is however an
average velocity and we observe that (i) for all networks, larger velocities occur at earlier stages
of the network and (ii) large fluctuations occur from one year to another. Interestingly, the
fraction of inactivity time (i.e. the time when no stations are built) is similar for all these
networks, with an average of about 58%. We also show in Fig. 3(A) the time evolution for each city
of the number of stations, using an absolute time scale. In particular, the size of the oldest
networks seems to progressively reach a plateau.
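The indicators of this section are straightforward to compute from a yearly series of station counts. The sketch below assumes such a series is available as a plain list (one value per year, starting at the initial year); the function name and the toy numbers are hypothetical.

```python
from statistics import mean, pstdev

def growth_indicators(stations_per_year):
    """Return (average velocity, its standard deviation, inactivity fraction)."""
    # Yearly velocity v = dN/dt, approximated by year-to-year differences.
    velocity = [b - a for a, b in zip(stations_per_year, stations_per_year[1:])]
    # Years in which no station was built.
    inactive = sum(1 for v in velocity if v == 0)
    return mean(velocity), pstdev(velocity), inactive / len(velocity)

# Hypothetical cumulative station counts over six consecutive years.
v_mean, v_std, f = growth_indicators([5, 5, 8, 8, 8, 12])
```

Here the population standard deviation (`pstdev`) is used; whether the source uses the population or sample estimator is not stated.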

To make growth comparable across all networks, we introduce a second graph in Fig. 3(B) featuring
the average, over all networks, of the number of stations after a certain number of years since
network creation. This average quantity exhibits a linear increase: a roughly constant number of
stations is added per unit time, which indicates that, as these networks become large, new stations
represent an increasingly small percentage of existing ones over the following decades. In other
words, the time evolution of all these networks is characterized by small additions and not by
sudden, abrupt changes with a large number of stations added in a short time. This first result
anticipates the fact that these large networks may reach some kind of limiting shape that we will
characterize in the next section. This incremental growth of subways might reflect socio-economic
concerns and pressure on the transportation networks, such as diminishing returns on investments,
as noted by various authors (see for example [28] for US highways).

TABLE II. $t_0$ is the initial year considered here for the different subway networks, $v$ is the
average velocity (number of stations built per year), $\sigma_v$ is the standard deviation of $v$,
and $f$ is the fraction of years of inactivity (no stations built).

FIG. 3: (A) Evolution of the number of stations for various large world subway networks. (B)
Evolution of the number of stations $y$ years after creation, averaged over all networks (tubes mark
the standard deviation across all networks). The linear shape indicates that the relative growth in
terms of new stations from one decade to another goes to zero for all these networks, signaling the
possible appearance of a stationary limit.

FIG. 4: Number of cities with a given number of stations at a given time.

Finally, when we study the evolution of various indicators versus the number $N$ of stations, an
important point for our statistical analysis is the number of subways with a given number of
stations at a given time $t$. We show this quantity in Fig. 4 and we can see that this number is
largest (almost 15) approximately for $N \in [25, 100]$; note that this figure is nonetheless too
small to allow a discussion of the normality of the various quantities considered below.
Unfortunately, for larger values of $N$ the number of cities is naturally smaller, and at this stage
we cannot give definitive answers but only suggest some limits for large $N$.


Characterization of the core and branches structure 

The large subway networks considered here thus converge to a long-time limit in which new stations
represent an ever smaller percentage of the existing ones. The remarkable point that we will show
below is that all these networks, despite their geographical and economic differences, converge to
a shape which exhibits several typical topological and spatial features. Indeed, by inspection, we
observe that in most large urban areas, the network consists of a set of stations delimited by a
'ring' that constitutes the 'core'. From this core, quasi-one-dimensional branches grow and reach
out to areas of the city further and further from the core. In Fig. 2 we show a sample of these
networks as they currently exist. We note here that the ring, which is defined topologically as the
set of core stations which are either at the junction of branches or on the shortest geodesic path
connecting these junction stations, may or may not exist as a subway line. For instance, for Tokyo,
there is such a circular line (called the Yamanote line), while for Paris the topological ring does
not correspond to a single line. It is also worth noting that in those systems where the core is
harder to define, such as NYC where physical constraints are strongly manifest (the rivers to the
east and west which bound Manhattan), a pseudo-core is evident where a series of lines coalesce to
enable travelers to move around the core circumferentially.

More formally, branches are defined as the set of stations which are iteratively built from a 'tail'
station, i.e. a station of degree 1. New neighbors are added to a given branch as long as their
degree is 2 (continuing the line) or 3 (defining a fork). In this latter case, the aggregative
process continues if and only if at least one of the two possible new paths stemming from the fork
is made up of stations of degree 2 or less. Note that the core of a network with no such fork is
thus a $k$-core with $k = 2$.

The general structure can schematically be represented as in Fig. 5.

We first characterize this branch and core structure with the parameter $\beta(t)$ defined as

$$\beta(t) = \frac{N_B(t)}{N_B(t) + N_C(t)} \qquad (2)$$

where $N_B(t)$ and $N_C(t)$ respectively represent the number of stations on branches and the number
of stations in the core at time $t$.
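As an illustration of the core/branch split and of Eq. (2), the sketch below computes the 2-core of a graph by repeatedly removing stations of degree 1 or less, and treats everything outside it as branch stations. This is a simplification: it coincides with the definition in the text only for networks without forks, and the special rule for forks is not reproduced. The adjacency-dictionary layout and the function names are assumptions.

```python
from collections import deque

def two_core(adjacency):
    """Nodes left after repeatedly removing nodes of degree <= 1."""
    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    removed = set()
    queue = deque(node for node, d in degree.items() if d <= 1)
    while queue:
        node = queue.popleft()
        if node in removed:
            continue
        removed.add(node)
        # Removing a node lowers the degree of its remaining neighbors.
        for neighbor in adjacency[node]:
            if neighbor not in removed:
                degree[neighbor] -= 1
                if degree[neighbor] <= 1:
                    queue.append(neighbor)
    return set(adjacency) - removed

def branch_fraction(adjacency):
    """beta = N_B / (N_B + N_C), with the core approximated by the 2-core."""
    n_core = len(two_core(adjacency))
    return (len(adjacency) - n_core) / len(adjacency)

# Hypothetical network: a triangle a-b-c (core) with a two-station branch c-d-e.
network = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e"}, "e": {"d"},
}
beta = branch_fraction(network)
```

In this toy network two of the five stations lie on the branch, so the sketch yields a branch fraction of 0.4.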

We can also characterize a little further the structure of branches. Their topological properties
are trivial and their complexity resides in their spatial structure. We can then determine the
average distance (in km) from the geographic barycenter of the city to all core and branch stations,
respectively $D_C(t)$ and $D_B(t)$ (the barycenter is computed as the center of mass of all
stations, or in other words, the average location of all the stations). This last distance provides
information about the spatial extension of the branches when we form the ratio

$$\eta(t) = \frac{D_B(t)}{D_C(t)} \qquad (3)$$

which gives a spatial measure of the amount of extension of the branches.

FIG. 5: Schematic structure of subway networks. A large 'ring' encircles a core of stations.
Branches radiate from the core and reach further areas of the urban system. The branches are
essentially characterized by their size (parameter $\beta(t)$, Eq. 2) and their spatial extension
(parameter $\eta(t)$ in Eq. 3). The core is characterized by its average degree
($\langle k_{core} \rangle(t)$ defined in Eq. 4) and fraction of nodes of degree 2 ($f_2$), its
number of stations $N_C(t)$ and its size $r_C(t)$.
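A minimal sketch of the ratio $\eta$ between the mean distances of branch and core stations to the barycenter, assuming station positions are given as planar coordinates in km (for example after projecting latitude and longitude) and that the split into core and branch stations is already known. The function names and the toy coordinates are hypothetical.

```python
import math

def barycenter(points):
    """Average location of a list of (x, y) points."""
    xs, ys = zip(*points)
    return sum(xs) / len(xs), sum(ys) / len(ys)

def mean_distance(points, origin):
    """Mean Euclidean distance from the points to the origin."""
    return sum(math.dist(p, origin) for p in points) / len(points)

def branch_extension(core_points, branch_points):
    """eta = D_B / D_C, distances measured from the barycenter of all stations."""
    origin = barycenter(core_points + branch_points)
    return mean_distance(branch_points, origin) / mean_distance(core_points, origin)

# Hypothetical layout: four core stations at 1 km and two branch stations at 3 km.
core = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
branches = [(3.0, 0.0), (-3.0, 0.0)]
eta = branch_extension(core, branches)
```

With these symmetric toy coordinates the barycenter is the origin and the ratio equals 3.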

We also need information on the structure of the core. The core is a spatial network which is
planar to a good accuracy for most networks, and it can be characterized by many parameters [11].
It is important to choose those which are not simply related but ideally represent different aspects
of the network (such as those proposed in the form of various indicators, see for example
[11, 12, 25]). At each time step $t$, we will characterize the core structure by the following two
parameters. The first parameter is simply the average degree of the core, which characterizes its
'density'

$$\langle k_{core} \rangle(t) = \frac{2 E_C(t)}{N_C(t)} \qquad (4)$$

where $N_C(t)$ is the number of core nodes and $E_C(t)$ the number of its edges. The average degree
is connected to the standard index $\gamma(t) = E_C(t)/(3N_C(t) - 6)$, where the denominator is the
maximum number of links admissible for a planar network [12].

The average degree of the core contains useful information about it, and there are many other
quantities (such as the standard index $\alpha$, etc., see for example [12]) which can give
additional information. We will use another simple quantity which describes in more detail the level
of interconnections in the core and which is given by the fraction $f_2$ of nodes in the core with
$k = 2$. In the case of a well-interconnected system, this fraction will tend to be small, while
sparse cores with few interconnections will have a larger fraction of $k = 2$ nodes.

Once we know this fraction $f_2$ of $k = 2$ nodes in the core, which characterizes the level of
interconnection, and the parameter $\eta(t)$, which characterizes the relative spatial extension of
branches, we have key information on the intertwinement of both topological and geographical
features in such "core/branch" networks.
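The three core indicators discussed here (average degree, planar index and fraction of degree-2 nodes) can be computed directly from the adjacency structure of the core. The sketch below assumes the core is given as a dictionary mapping each station to the set of its neighbors; this layout, the function name and the toy graph are hypothetical.

```python
def core_indicators(core_adjacency):
    """Return (average degree, gamma index, fraction of degree-2 nodes)."""
    n_nodes = len(core_adjacency)
    degrees = [len(neighbors) for neighbors in core_adjacency.values()]
    n_edges = sum(degrees) // 2           # each edge is counted from both ends
    mean_degree = 2 * n_edges / n_nodes   # Eq. (4)
    gamma = n_edges / (3 * n_nodes - 6)   # planar bound, needs at least 3 nodes
    f2 = sum(1 for d in degrees if d == 2) / n_nodes
    return mean_degree, gamma, f2

# Hypothetical core: a square a-b-c-d with one diagonal a-c.
core_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}
k_core, gamma, f2 = core_indicators(core_graph)
```

For this toy core, two of the four stations are plain degree-2 stations and the other two act as interconnections.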

Time evolution of $\beta$, $\langle k_{core} \rangle$, $f_2$ and $\eta$

The historical development of these networks is very 
different from one city to another and representing the 
evolution of a specific quantity versus time would prob- 
ably not be particularly meaningful. Similarly, city net- 
works often experience significant development in some 
particular years, while they experience little or no evo- 
lution for the rest of the time. In order to be able to 
compare the networks across time periods and cities, we 
propose to study their evolution in terms of the number 
of stations N that are constructed. 

We first plot in Fig. 6(A) the parameter $\beta$ as a function of $N$ for the networks studied here.
It is difficult to draw strong conclusions from this plot, but we can bin these data and represent
the average value of $\beta$ per bin and its dispersion as well (Fig. 6(B)). On this figure we may
see that the average value of $\beta$ seems to stabilize slowly to some value in [0.35, 0.55].
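The binning step can be sketched as follows. The text does not specify how the bins are built, so equal-width bins in $N$ are an assumption here, as are the function name and the toy data.

```python
from statistics import mean, pstdev

def bin_average(points, n_bins):
    """Average (N, value) pairs over equal-width bins in N.

    Returns a list of (bin center, mean value, standard deviation)
    for the non-empty bins.
    """
    n_min = min(n for n, _ in points)
    n_max = max(n for n, _ in points)
    width = (n_max - n_min) / n_bins or 1.0  # avoid a zero width if all N are equal
    bins = [[] for _ in range(n_bins)]
    for n, value in points:
        # The largest N falls on the upper edge and is placed in the last bin.
        index = min(int((n - n_min) / width), n_bins - 1)
        bins[index].append(value)
    return [
        (n_min + (i + 0.5) * width, mean(values), pstdev(values))
        for i, values in enumerate(bins) if values
    ]

# Hypothetical (N, beta) observations pooled over networks and years.
samples = [(0, 0.2), (10, 0.4), (90, 0.5), (100, 0.7)]
binned = bin_average(samples, n_bins=2)
```

Each returned triple gives one point of the averaged curve together with its dispersion.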

It is also important to characterize the spatial importance of the branches. The parameter $\eta$
gives a valuable indication about their extension and we show in Fig. 7 the evolution of this
parameter with $N$ (the data are binned). This figure shows that in the interval where we have the
largest number of subways, the average value of $\eta$ is around 2 with relatively large fluctuations
which seem to decrease with $N$.

The parameters $\beta$ and $\eta$ give an indication of the importance of the core but do not say
anything about its structure. A first structural indication may be given by its average degree
$\langle k_{core} \rangle$ and by the percentage $f_2$ of nodes in the core having a degree $k$ equal
to 2. In particular, these two quantities shed light on how interconnections are created in the
core. We display in Fig. 8(A) the average degree of the core $\langle k_{core} \rangle$ which, even
if there is a slow increase with $N$, displays moderate variations around approximately 2.4.

This value is relatively small and indicates that the fraction of connecting stations (i.e. with
$k > 2$) is also small, which means that most core stations belong to one single line, with few that
actually allow connections. More precisely, we observe in Fig. 8(B) that on average for subways
with $N < 100$ the fraction of interconnecting stations is increasing with $N$, which probably
corresponds to some organization of the subway, but that for larger subways ($N > 100$), the
percentage $f_2$ is increasing again, which probably corresponds to a densification process without
the creation of new interconnections. This densification can indeed be confirmed as the diameter of
the core (see Fig. 9) seems to reach a plateau for most cities.

FIG. 6: (A) Parameter $\beta$ as a function of the number of stations $N$ for the different world
subways. (B) Same as (A) but averaged over 20 bins and showing the standard deviation.

FIG. 7: Evolution of the ratio $\eta$, which characterizes the spatial extension of branches
relative to the core.

FIG. 8: (A) Average degree of the core $\langle k_{core} \rangle$ (Eq. 4) and its dispersion versus
number of stations (averaged over 20 bins). (B) Evolution of the percentage $f_2$ of $k = 2$ core
nodes (averaged over 20 bins).

FIG. 9: Evolution of the mean distance to the barycenter (in km) of core stations with the number
of stations $N$.

As noted above, the number of subways with large $N$ is smaller and the statistics therefore less
reliable. At this point and with this statistical error in mind, we observe that the average value
of $\beta$ and its dispersion are decreasing with $N$, which suggests that $\beta$ could converge to
some 'limiting' value $\beta_\infty \approx 45\%$. The same remarks also apply to $\eta$ and suggest
a limiting value of order 2. Concerning the core, the dispersion of $\langle k_{core} \rangle$ is
always moderate and approximately constant, showing that the fluctuations among different networks
are also moderate. We observe a slow increase of $\langle k_{core} \rangle$ pointing to a mild yet
continuing densification of the core, even after a long period of time. The fraction of connecting
stations has a more complex dynamics and seems to decrease with $N$ for large networks. In these
networks, there is an obvious cost associated with a large degree $k > 2$ and such a decreasing
fraction could be due to the fact that a small fraction is enough to enable easy navigation in the
network.

In summary, our results display non-negligible fluctuations but suggest that large subway networks
may converge to a long-time limiting network largely independent of their historical and
geographical differences. So far, we can characterize the 'shape' of this long-time limiting network
with values of $\beta_\infty \approx 45\%$, $\eta_\infty \approx 2$, and a core made of
approximately 80% of non-connecting stations. It will be interesting to observe the future evolution
of these networks in order to confirm (or not) our current results.



Balance between the core density and the branch 
structure 



Even if it seems that the values of various indicators 
converge with the size of the networks, we still have ap- 
preciable variations. For example 77 varies from ^ 1 to 
~ 3 and exhibits a relatively constant and not negligible 
relative dispersion. It is thus important to understand 
the remaining differences between these networks. To 
achieve this, we focus on the relation between rj which 
characterizes the spatial extension of the branches rela- 
tive to the core, and the percentage /2 of /c = 2 nodes 
in the core which indicates how well connected the core 
is. We focus on the 'final' values of these parameters 
obtained for 2009 for the various networks and we ob- 



tain the plot shown in Fig. 11 From this figure, we first 



Number of branches 

We now consider the number JVb of different branches. 
A naive argument would be that the number of branches 
is actually proportional to the perimeter of the core struc- 
ture. This implicitly assumes that the distance between 
different branches is constant. In turn, the perimeter 
should roughly scale as a/TV as the core is a relatively 
dense planar graph and contains a number of nodes pro- 
portional to N. These assumptions thus leads to 



Mb 



N 



(5) 



We display the number of branches versus the number 
of stations N for the various networks considered here. 



A power law fit of the data presented in figure 10 gives 




FIG. 10: Loglog plot of the number of different branches 
versus the number of stations for the different subway net- 
works considered here. The dashed line is a power law fit 
with exponent 0.6. 



Afs ^ with b 
argument. 



0.6 (r^ = 0.85) consistent with our 



NYC 

Chicago 
Osaka 



Barcelona 



.Shanghai 
exico ^ •seoul 

Berhn 



^ Madrid 
Paris 
London Tokyo 



FIG. 11: Relation between the spatial extension of branches 
and the degree of interconnection in the core. The 2009 values 
for the percentage /2 of /c = 2 core nodes and 77 are plotted 
for 12 city subways. 

see that $(\eta, f_2)$ ranges from $(1.4, \approx 85\%)$ for NYC up
to $(3.3, \approx 45\%)$ for Moscow, which is indeed a highly
ramified network with a very dense core.
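As a rough illustration of how the percentage $f_2$ of $k = 2$ nodes and the average degree could be computed for a given set of core stations, here is a minimal Python sketch. The function name `degree_two_fraction`, the edge-list representation and the toy network are assumptions made for illustration. The core is taken as given, and degrees are counted over whichever edges are supplied.

```python
def degree_two_fraction(edges, core_nodes):
    """Return (f2, mean_degree) for the nodes listed in core_nodes.

    Assumption: degrees are counted over all edges passed in, so pass
    only core edges if the degree inside the core subgraph is wanted.
    """
    degree = {node: 0 for node in core_nodes}
    for u, v in edges:
        if u in degree:
            degree[u] += 1
        if v in degree:
            degree[v] += 1
    n = len(degree)
    f2 = sum(1 for k in degree.values() if k == 2) / n
    mean_degree = sum(degree.values()) / n
    return f2, mean_degree

# Toy core: a 4-station ring plus one chord (hypothetical network).
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]
f2, k_mean = degree_two_fraction(toy_edges, {"a", "b", "c", "d"})
print(f2, k_mean)
```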

Very roughly speaking, we first observe that for this set of the
largest subway systems in the world, the percentage $f_2$ is large,
above 60%, and relatively independent of $\eta$. At a finer level, we
observe from this figure that clusters of networks with similar
properties also emerge. The first cluster comprises Beijing, Berlin,
Shanghai, and Seoul, which are remarkably close to each other: $f_2$ is
of order $80\% \pm 5\%$ and $\eta(t) \approx 2.84 \pm 0.1$. This cluster
thus corresponds to subway networks with a large degree of ramification
and a lower interconnection level in their core. Not surprisingly, this
cluster comprises rapidly evolving networks such as Beijing and
Shanghai. Another cluster comprises London, Paris and Madrid, with a
smaller value of $f_2 \sim 70\% \pm 5\%$, which might result from their
denser city center structure, and a smaller value of $\eta \approx 2$.
This other cluster corresponds to denser networks, less ramified but
with more interconnections in the core. Finally, we can identify
another cluster made of Chicago and Osaka, with a small value of $\eta$
and a relatively dense core (with $f_2 \sim 70\%$).

SPATIAL ORGANIZATION OF THE CORE AND 
BRANCHES 

Following earlier studies on the fractal aspects of subway networks
[14], we can inspect the spatial organization of subways by considering
the number of stations $N(r)$ at a distance less than or equal to $r$,
where the origin of distances is the barycenter of all stations
considered as points. Interestingly, the barycenter of all stations is
almost motionless, except in the case of NYC, where the barycenter
moves from Manhattan to Queens; we therefore exclude NYC from further
study. Chicago is a similar case: the spatial structure of its core is
peculiar, mainly due to the presence of the lake, which constrains the
directions in which the network can expand. We also exclude this
network in this section. It should however be noted that both Chicago
and NYC do follow the picture of core and branches; the main difference
with the other networks is that the core of these networks has no clear
spatial meaning due to geographical constraints (such as the presence
of a lake for Chicago and a particular land area shape for NYC).
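The measurement of $N(r)$ around the barycenter can be sketched in a few lines. This is a minimal illustration, assuming stations are given as planar coordinates in km (real latitude/longitude data would first need a projection) and that all stations carry equal weight. The function name `cumulative_station_count` and the toy layout are hypothetical.

```python
import numpy as np

def cumulative_station_count(coords, radii):
    """Return N(r): number of stations within distance r of the barycenter."""
    coords = np.asarray(coords, dtype=float)
    barycenter = coords.mean(axis=0)   # stations treated as equal-weight points
    dist = np.sort(np.linalg.norm(coords - barycenter, axis=1))
    # side="right" counts every station with distance <= r
    return np.searchsorted(dist, np.asarray(radii, dtype=float), side="right")

# Hypothetical planar coordinates in km (a small cross-shaped layout).
stations = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (3, 0), (-3, 0)]
print(cumulative_station_count(stations, [0.5, 1.0, 3.0]))
```

Recomputing the barycenter for each year of a network's history would also show whether it stays almost motionless, as reported for most cities.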

For the year 2009, the limiting shape made of a core and branches
implies that there is an average distance $r_c$ which determines the
core. In practice, we can measure on the network the size $N_c$ of the
core and we then define $r_c$ such that $N(r = r_c) = N_c$ (which
implicitly assumes an isotropic core shape, which is the case for most
networks except for the excluded cases of Chicago and NYC). For the
various cities, we can easily compute the function $N = N(r)$, from
which we can extract $r_c$, and we report the results in Table III.



City         N_c    r_c (km)
Beijing       63      4.4
Tokyo        123      5.0
Seoul        243     11.6
Mexico        90      4.7
Shanghai      57      3.7
Moscow        39      5.9
London       142      7.3
Paris        186      4.2
Madrid       113      4.4
Berlin        68      5.5
Barcelona     57      3.5
Osaka         46      3.6

TABLE III: For each city, we compute the number of stations in the
core (for the year 2009), and from the numerical calculation of $N(r)$
we can estimate $r_c$, the size of the core (in km), from
$N(r = r_c) = N_c$.
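One possible way to extract $r_c$ from the condition $N(r = r_c) = N_c$, and to rescale the profile by $r_c$ and $N_c$, is sketched below. It assumes planar station coordinates in km and takes $r_c$ as the smallest radius at which $N(r)$ reaches $N_c$. The function names and the toy layout are hypothetical and are not the procedure used to produce Table III.

```python
import numpy as np

def sorted_distances(coords):
    """Sorted distances of all stations to their barycenter."""
    coords = np.asarray(coords, dtype=float)
    return np.sort(np.linalg.norm(coords - coords.mean(axis=0), axis=1))

def core_radius(coords, n_core):
    """Smallest r such that N(r) >= n_core."""
    dist = sorted_distances(coords)
    if not 1 <= n_core <= len(dist):
        raise ValueError("n_core must be between 1 and the number of stations")
    return dist[n_core - 1]            # distance of the n_core-th closest station

def rescaled_profile(coords, n_core):
    """Return (r / r_c, N(r) / N_c) evaluated at every station distance."""
    dist = sorted_distances(coords)
    r_c = core_radius(coords, n_core)
    if r_c == 0:
        raise ValueError("core radius is zero; cannot rescale")
    counts = np.searchsorted(dist, dist, side="right")   # N(r), ties included
    return dist / r_c, counts / n_core

# Hypothetical layout: 5 central stations and 2 distant ones (km).
stations = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (3, 0), (-3, 0)]
print(core_radius(stations, 5))
print(rescaled_profile(stations, 5))
```

When several stations sit at exactly the same distance, $N(r_c)$ can slightly exceed $N_c$; for real data this effect should be negligible.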

Next, we can rescale $r$ by $r_c$ and $N(r)$ by $N_c$, and we then
obtain the results shown in Fig. 12.

FIG. 12: (A) Rescaled number of stations at distance $r$ from the
barycenter as a function of the rescaled variable $r/r_c$, where $r_c$
is the size of the core defined by $N(r = r_c) = N_c$ (shown here in
log-log). The dotted line represents a power law $\sim r^2$ and serves
as a guide to the eye. (B) Case of Moscow, where the two regimes
($r < r_c$ and $r > r_c$) with their different exponents are visible
(the dotted lines serve here as a guide to the eye).



This figure displays several interesting features. First, the short
distance regime $r < r_c$ is well described by a behavior of the form
$N(r) \sim \rho_c \pi r^2$, consistent with a uniform density $\rho_c$
of core stations. For very large distances, we observe for most
networks a saturation of $N(r)$. The interesting regime is then the one
for intermediate distances, when $r$ is larger than the core size but
smaller than the maximum branch size $r_{\max}$. This intermediate
regime is characterized by different behaviors with $r$. A similar
result was obtained earlier [14], where the authors observed for Paris
that $N(r > r_c) \sim r^{0.5}$, a result that was at that time difficult
to understand in the framework of fractal geometry.



Here we show that these regimes can be understood in terms of the core
and branches model, with the additional factor that the spacing between
consecutive stations increases with $r$. Within this picture (and
assuming isotropy), $N(r)$ is given by

$$
N(r) \sim
\begin{cases}
\rho_c \pi r^2 & \text{for } r < r_c \\
\rho_c \pi r_c^2 + N_b \int_{r_c}^{r} \frac{dr'}{\Delta(r')} & \text{for } r_c < r < r_{\max} \\
N & \text{for } r > r_{\max}
\end{cases}
\qquad (6)
$$

where $N$ is the total number of stations, $N_b$ is the number of
branches and $\Delta(r)$ is the average spacing between stations on
branches at distance $r$ from the barycenter.
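The core-and-branches form of $N(r)$ can be evaluated numerically as follows: a uniform disc for $r < r_c$, then the core count plus $N_b$ times the integral of $1/\Delta(r)$ along the branches, then saturation at $N$. This is only a sketch under stated assumptions: the core density is set to $\rho_c = N_c / (\pi r_c^2)$ so that $N(r_c) = N_c$, the integral uses a simple trapezoidal rule, and the function name and all demo values other than $N_c = 39$ and $r_c = 5.9$ km (Moscow, Table III) are hypothetical.

```python
import numpy as np

def model_station_count(r, n_core, r_core, n_branches, r_max, n_total, spacing):
    """Evaluate the core-and-branches form of N(r) at a single radius r.

    spacing is a callable returning the interstation spacing Delta(r)
    on branches for an array of radii.
    """
    rho_core = n_core / (np.pi * r_core**2)   # uniform core density (assumption)
    if r < r_core:
        return rho_core * np.pi * r**2
    if r > r_max:
        return float(n_total)
    # Branch regime: core stations plus N_b * integral of dr / Delta(r).
    grid = np.linspace(r_core, r, 201)
    integrand = 1.0 / spacing(grid)
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid))
    return n_core + n_branches * integral

# N_c and r_c from Table III (Moscow); the other values are hypothetical.
moscow_like = dict(n_core=39, r_core=5.9, n_branches=10, r_max=19.9,
                   n_total=109, spacing=lambda r: np.full_like(r, 2.0))
for radius in (3.0, 10.0, 25.0):
    print(radius, model_station_count(radius, **moscow_like))
```

With a constant spacing, the branch regime grows linearly in $r$; passing a spacing that increases with $r$ makes it grow more slowly.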

In order to test this shape, we can determine the various parameters
of Eq. (6), namely $N_b$, $N_c$, $r_c$, and $\Delta(r)$, and plot the
resulting shape of Eq. (6) against the empirical data. It is easy to
determine empirically the numbers $N_b$, $N_c$, and $r_c$, but the
quantity $\Delta(r)$ is extremely noisy due to the small number of
points (all these numbers are determined for the year 2009), especially
for large values of $r$ close to $r_{\max}$, a distance at which there
is often no more than a handful of stations.

The least noisy situation is obtained in the case of Moscow, which has
long branches and for which we obtain a roughly constant interstation
spacing. In this case we obtain for $r > r_c$ a behavior of the form
$N(r) \sim N_b r$ (see Fig. 12(B)).

More generally, the large distance behavior for $r_c < r < r_{\max}$
will be of the form

$$
N(r_c < r < r_{\max}) \propto r^{1-\tau} \qquad (7)
$$

where $\tau$ denotes the exponent governing the variation of the
interstation spacing with distance, $\Delta(r) \sim r^{\tau}$. For most
networks, the regime $r_c < r < r_{\max}$ is small and, as already
mentioned, $\Delta(r)$ is very noisy. Rough fits in different cases
give a behavior for Eq. (6) consistent with data (see Fig. 13).
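The power law behavior in the branch regime follows from a short integration. As a sketch, assume a pure power law spacing $\Delta(r) = \Delta_0 r^{\tau}$ with $\tau \neq 1$, where the prefactor $\Delta_0$ is introduced here only for illustration:

```latex
\begin{align}
N(r) - N_c
  &= N_b \int_{r_c}^{r} \frac{dr'}{\Delta(r')}
   = \frac{N_b}{\Delta_0} \int_{r_c}^{r} r'^{-\tau}\, dr' \\
  &= \frac{N_b}{\Delta_0 (1 - \tau)}
     \left( r^{1-\tau} - r_c^{1-\tau} \right),
     \qquad r_c < r < r_{\max} .
\end{align}
```

For $\tau < 1$ and $r$ well above $r_c$, the leading term grows as $r^{1-\tau}$. A constant spacing ($\tau = 0$) gives a linear growth in $r$, while $\tau = 0.5$ gives $N(r) \sim r^{0.5}$.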




FIG. 13: $N(r)/N_c$ versus $r/r_c$ for Moscow, Tokyo, Paris, and
Madrid (from top to bottom and left to right). The circles represent
the data and the green solid line the fit using Eq. (6) with
parameters estimated from the empirical data.

In particular, for Moscow, which has long branches, we observe a
behavior consistent with $\Delta(r)$ constant, while for the other
networks we observe an increasing trend, but an accurate estimate of
$\tau$ is difficult to obtain given the small variation range of $r$,
with no more than one decade of available data. For example, a fit over
this decade of data gives for Paris $\tau \approx 0.5$ (with
$r^2 = 0.74$), in agreement with the result obtained in [14]. Given the
difficulty of obtaining accurate quantitative results, more data is
needed to have a definite answer, and so far we can only claim that the
data are not inconsistent with the behavior of Eq. (6), which supports
our picture of a long time limit network shape made of a core and
radial branches.



DISCUSSION 

In summary, we have observed a number of similar- 
ities between different subway systems for the world's 
largest cities, despite their geographical and historical 
differences. 

First, we have shown that the largest subway networks exhibit a
similar temporal decrease of most fluctuations around their long term
stable values and thus converge to a similar structure. We identified
and characterized the shape of this long time limiting graph as a
structure made of core and branches, which appears to be relatively
independent of the historical idiosyncrasies associated with the
evolution of these particular cities.

For most large networks, we observe a fraction of branches of about
45% and a ratio of the spatial extension of branches to that of the
core of about 2. The number of branches scales roughly as the square
root of the number of stations. The cores of these different city
networks have approximately the same average degree, which increases
with network size from $\approx 2$ to $\approx 2.4$ when
$N \approx 100$, after which it remains approximately within the
interval [2.3, 2.5] (with moderate fluctuations). The fraction of
$k = 2$ nodes in the core is generally larger than 60%.

In addition, this picture of a core with branches and an increasing
spacing between consecutive stations on these branches is confirmed by
spatial measurements such as the number of stations at a given distance
$r$, and provides a natural interpretation of these measures.

The evolution of networks in general and urban networks in particular
represents an exciting unexplored problem which mixes spatial and
topological properties in unusual and often counterintuitive ways. Such
problems require a specific set of indicators that describe these
phenomena. Other data such as population density, land use activity
distribution, and traffic flows are likely to bring relevant
information to this problem and would undoubtedly enrich our study. We
believe however that the present approach represents an important
exploratory step in our understanding and is crucial for the modeling
of the evolution of urban networks. In particular, the existence of
unique long-time limit topological and spatial features is a universal
signature that fundamental mechanisms, independent of historical and
geographical differences, contribute to the evolution of these
transportation networks.